Segmentation of toddler nutritional status using REBUS and FIMIX partial least square in Southeast Sulawesi

Nutrition is one of the important factors that play a major role in the growth and development of children so that they can develop optimally. Child malnutrition, such as stunting, underweight, and wasting, is a significant problem in Indonesia. The World Health Organization (WHO) determined that the nutritional status of children under five in Indonesia is in the chronic category, one of which is in Southeast Sulawesi Province. This study examines and analyzes the factors that influence the nutritional status of children under five in Southeast Sulawesi using the Partial Least Square Structural Equation Modeling (SEM-PLS) method and then segments the nutritional status of children under five using Response Based Unit Segmentation Modeling in Partial Least Square (REBUS-PLS) and Finite Mixture Partial Least Square (FIMIX-PLS). The number of observations in this study was 216 sub-districts. From the results of the SEM-PLS analysis conducted, it was concluded that the 10 indicators used were valid and significant in describing the latent variables, and the practice factor variable had an effect on the food factor variable, the food factor variable had an effect on the service factor variable, and the service factor variable had an effect on the under-five nutritional status variable. The REBUS-PLS analysis results in two segments, with one segment of 75 observations and the other segment of 141 observations. The same conclusion is obtained as in the SEM-PLS analysis, but the results of the analysis with REBUS-PLS have a greater value than the results of the SEM-PLS analysis. Key points of the article: 1. Comparing REBUS-PLS and FIMIX-PLS methods for overcoming the case of heterogeneity in dat.2. Combining the SEM-PLS method with the REBUS and FIMIX methods in discussing the factors that influence the nutritional status of children under five in Southeast Sulawesi.


Introduction
Nutrition is an important factor that must be considered during the growth period of children.The growth period of children is very closely related.Providing insufficient nutritional intake in the womb will adversely affect the growth of children in the future, and can even cause death.In addition to causing death, malnutrition can also cause children to experience several growth problems such as stunting, underweight and wasting.Stunting is a condition of failure to thrive in infants under five years old (toddlers) due to chronic malnutrition so they have a shorter height than their age.Underweight is a condition where children under five years old weigh less than their peers due to malnutrition.Meanwhile, wasting is a growth disorder where children have less weight for height than their peers.Malnourished toddlers will be more susceptible to disease and have a level of intelligence that is not maximized, so that it can risk reducing future productivity levels [1] .
In 2020, the distribution rate of the three growth problems in toddlers in Indonesia will already be in the acute chronic category.According to the World Health Organization (WHO), the maximum number of public health problems in a country should not exceed 20% to avoid problems that can be considered chronic [2] .Southeast Sulawesi is one of the provinces with under-five nutritional status problems that fall into the chronic-acute category, where the distribution rate of stunting is 30.2%,underweight is 20.9%, and wasting is 6.6%.With the high distribution of nutritional status problems, research is needed to determine the factors that affect the nutritional status of toddlers in Southeast Sulawesi.
Nutritional status in toddlers is the state of the body as a result of food consumption and the use of nutritious substances, differentiated between undernutrition, good nutrition, and overnutrition [3] .There are several basic factors that influence the nutritional status of toddlers, namely food, service, and practice factors.Food factors include the basic nutritional needs of pregnant women.Then the service factor includes services for mothers and babies as an action for health checks.Then the nutritional status of toddlers is also influenced by practical factors but indirectly.Based on the existing factors, these factors cannot be measured empirically, so indicators are needed that can measure the influence of these factors using methods that can handle multidimensional problems.
Based on the above problems, a method is needed that can measure and explain the relationship between factors and indicators that affect the nutritional status of toddlers.Structural Equation Modeling (SEM) is a suitable method to handle existing problems.This is because SEM is a statistical method that can determine the pattern of relationships between construct variables or latent variables and their indicators [4] .The SEM method has several conditions that must be met, namely, multivariate normal distributed data, observations must be independent of each other, indicators must be reflective, and the number of sample units used must be large and free from multicollinearity.If there are assumptions that are not met, the statistical model obtained will be less good [5] .However, in reality, real data collection in the field often shows abnormal data patterns due to uneven data collection that causes outliers, and the number of samples obtained in the field does not meet the requirements.So that a method is needed that can overcome the limitations of these assumptions, a variant-based SEM method called Partial Least Square Structural Equation Modeling (SEM-PLS) was developed by Herman Wold in 1966 [6] .The use of the SEM-PLS method is more flexible and powerful than ordinary SEM because it can be done on a small amount of data and free from multivariate normal assumptions.
In SEM-PLS, the sample taken is assumed to come from a homogeneous population.However, in reality, the population is not always homogeneous, so it can lead to invalid conclusions.In this study, to overcome these problems, the Response Based Unit Segmentation in Partial Least Square (REBUS-PLS) and Finite Mixture Partial Least Square (FIMIX-PLS) methods were used, which are methods to detect and analyze unobserved heterogeneity.REBUS-PLS and FIMIX-PLS analyses are performed by segmenting the observation units and estimating the parameters of each local model on the segments formed.
This study will model the factors that influence the nutritional status of children under five in Southeast Sulawesi Province using the SEM-PLS approach, then use REBUS-PLS and FIMIX-PLS to group regions in Southeast Sulawesi Province based on the nutritional status of children under five in 2020.
In assessing their influence, all factors affecting health status cannot be measured directly because these factors are variables that cannot be observed directly (latent variables) and are built from several indicator variables.Structural Equation Modeling (SEM) is used to represent the relationship between these latent variables.This SEM model has often been used to examine the relationship between latent variables [5][6][7] .To find the appropriate segments, we combined the SEM-PLS method with the REBUS and FIMIX methods [ 8 , 9 ].This procedure is performed by segmenting the observation units and estimating the parameters of each local model on the formed segments.For REBUS-PLS, observation units are grouped based on similar behavior or similar performance [7] .While the grouping of observation units in FIMIX-PLS are grouped based on the strata.
SEM is a multivariate technique that combines aspects of factor analysis and multiple regression that allows research to be conducted simultaneously to examine the relationship between latent variables and latent variables with measured variables.SEM can test the structural equation model, which is the relationship between endogenous latent variables and exogenous latent variables.In addition, it can be used to test the measurement model, namely the relationship between indicator variables and latent variables, which shows the magnitude of the correlation between the indicator and the latent variable it describes [6] .SEM is not used to design a theory but is more intended to examine and justify a hypothesis model based on the justification of the theory described in the form of a path diagram [7] .The SEM model has three main components, namely variables, models, and errors.
There are two types of variables in the SEM method, namely latent variables and manifest variables.Latent variables are variables that cannot be measured, while manifest variables are variables that can be measured empirically.Latent variables consist of two types of variables, namely exogenous latent variables ( ) and endogenous latent variables ( ) [10] .Exogenous latent variables are independent variables in an equation, and endogenous latent variables are dependent variables in an equation.
In the SEM method, there are two types of models, namely structural models and measurement models.The structural model is a model that describes the relationship between latent variables.The measurement model is a model that describes the relationship between latent variables and manifest variables.The measurement model is carried out by factor analysis, i.e., each latent variable is modeled as a general factor of its measurement.The SEM method has two types of errors, namely structural error and measurement error.

Structural equation modeling partial least square (SEM-PLS)
SEM-PLS is an alternative SEM approach that shifts from covariance-based to variance-based.The SEM-PLS design is intended to overcome the limitations of the SEM method when data experience problems such as measuring data with a certain scale, small sample size, missing values, abnormal data, and multicollinearity.In addition, SEM-PLS can be used on any type of data scale (nominal, ordinal, interval, or ratio) and has more flexible assumption requirements [8] .The measurement model consists of two models, namely the reflexive model and the formative model.The reflexive model is the relationship between manifest variables and latent variables, where if there is a change in the latent variable, it will result in a change in the manifest variable, but not vice versa.The reflexive model, also known as the principal factor model, has the following equation: (1) The formative model is the opposite of the reflexive model, where if there is a change in the indicator variable, there will be a change in the latent variable.The formative model equation is as follows: The structural model is a model that describes the relationships that exist between latent variables based on substantive theory.Inner models are usually referred to as inner relations.The inner model equation is as follows: Parameter estimates are obtained through a three-stage iteration process consisting of a weight estimate, a path estimate, means, and location parameters.The first stage of the weight estimate is used to create a score of latent variables [4] .Weight estimates for parameters in the reflexive model for exogenous latent variables are obtained by the least squares method by minimizing the sum of squared errors  ℎ , so that the weights for the exogenous reflexive model are obtained as follows: Meanwhile, the weight of the endogenous reflexive model is as follows: The parameter estimation of the formative model is as follows: Path estimation is used to obtain parameter coefficient values for each latent variable.Measurement model estimation yj from standardized latent variables ( j -mj ) with mean 0 and standard deviation = 1, estimated by a linear combination of the centre of manifest variables while structural model estimation is obtained from standardized latent variables ( j -mj ).The inner model weights e ji are selected through three schemes, the path schemes, centroid scheme and factor scheme.
The estimated means and location parameters are the regression constants of the indicators and other variables, with the following equation:

Bootstrap
The bootstrap method is a non-parametric estimation method that can estimate the parameters of a distribution, the variance of the sample median, and the error rate.In bootstrapping, the same steps are required repeatedly to estimate the shape of the sampling distribution.Sampling using the bootstrap method is carried out by returning the data sample (resampling with replacement).PLS with small samples requires resampling with the bootstrap standard error method to obtain significance values and estimates of the measurement model and structural model that converge by finding estimates of standard error [9] .

Hypothesis testing
The hypothesis tested is the hypothesis in the outer model ( ) and the hypothesis in the inner model (  and ) based on the parameter estimation results so that it is known whether the variables have a significant effect or not.The hypothesis on the outer model ( ) is as follows.
H 0 :  ℎ = 0 , h = 1,2,…, p (The loading factor of the h th indicator is not significant in measuring the latent variable) H 1 :  ℎ ≠ 0 , h = 1,2,…, p (The loading factor of the h th indicator is significant in measuring the latent variable.) where λℎ is the estimated value of  ℎ and  ( λℎ H 0 :   = 0 (The i th endogenous latent variable is not significant in measuring other endogenous latent variables.)H 1 :   ≠ 0 (The i th endogenous latent variable is significant in measuring other endogenous latent variables.)

Gamma Parameter ( 𝛾)
H 0 :   = 0 (The i th exogenous latent variable is not significant in measuring the endogenous latent variable.)H 1 :   ≠ 0 (The i th exogenous latent variable is significant in measuring the endogenous latent variable.) Test statistics H0 rejected if | | >  ∕ 2 , or p-value < .

SEM-PLS model evaluation
The model evaluation stage in SEM-PLS includes the evaluation of the measurement model (outer model) and the structural model (inner model).The evaluation of the measurement model is based on the following values: 1. Convergent validity is seen based on item reliability and construct reliability.Based on item reliability, a correlation can be said to meet convergent validity if it has a loading factor value greater than 0.5 to 0.6 and the t-test value obtained from the bootstrapping process is greater than the t-table at a certain , as in this study using  = 10% [8] .Meanwhile, based on construct reliability, it is expected that the Cronbach alpha value is > 0.7 for all constructs.The Cronbach alpha value is obtained from where var ( 2. Discriminant The validity of the reflexive measurement model can be calculated based on the cross-loading value of the manifest variable on each latent variable.In addition, discriminant validity can also be calculated using the square root value of average variance extracted (AVE), which can be said to be good if the AVE is greater than 0.5.
3. Composite reliability: latent variables can be said to have good reliability if the composite reliability value is greater than 0.6.
4. Variance Inflation Factor (VIF): if the VIF value is above 10, it indicates that there is multicollinearity.
The structural model was evaluated using the following statistical measures: 1. R-squared (R 2 ): an R 2 value of 0.75 can be categorized as strong or indicates that the model is well formed.Meanwhile, an R 2 value of 0.5 is categorized as moderate (medium), and an R 2 value of 0.25 means the model is categorized as weak [10] .
2. Q-square Predictive relevance is used to validate the predictive ability of the model.If this value is greater than 0, it indicates that the exogenous latent variable is suitable as an explanatory variable that can predict the endogenous variable [8] .
The evaluation of the overall measurement and structural models can be seen based on the goodness of fit index (GoF index) value as follows: The model is said to be able to explain empirical data well if the GoF value is greater than 0.36 [11] .

Agglomerative hierarchical cluster analysis
The agglomerative hierarchical cluster method is a clustering technique that starts with individual objects.These objects are then grouped based on similarity or dissimilarity until each object forms a cluster with similar characteristics in each cluster.The results of the method can be depicted in a two-dimensional diagram called a dendrogram.A dendrogram is a diagram that illustrates the stages of the hierarchical clustering process.The vertical axis of the dendrogram represents the homogeneity value or distance between groups, and the horizontal axis represents the grouped objects.The grouping is based on the distance between objects, which can be divided into four methods: single linkage, complete linkage, and average linkage [12] .

Metode ward's
Ward's method is a hierarchical clustering method that is based on minimizing information lost due to merging two groups.Ward's method, in other words, minimizes the increase in the sum square of error (SSE) criterion.The two groups that have the minimum increase in SSE will be grouped.When all groups combine into one group of N objects, calculating the distance between the two groups using Ward's method is as follows [12] .

Response based unit segmentation in partial least square (REBUS-PLS)
REBUS-PLS is an advanced method of SEM-PLS that can group or segment observation units as well as estimate the parameters of each local model in the segments formed.The unit of observation is expressed based on similarities in behavior or performance.SEM-PLS assumes that the samples taken come from a homogeneous population.In reality, statistical data is not always homogeneous, which can lead to biased analysis results that result in invalid conclusions.REBUS-PLS is a method that can detect heterogeneity, where heterogeneity can be detected by grouping observation units.Grouping of observation units is based on a measure of closeness or distance of an index called the closeness size index (CM index), which is a goodness of fit index (GoF index) structure calculated from the residual commonality model described in the following equation.
where  2  is the model residual size for the i th unit in the g th latent class, which refers to the q th indicator variable in the j th block, while  ( ξ   ) explains the communal index of the q th variable from the j * th block in the g th latent class.In addition,  2   *  shows the residual of the structural model for the i th unit in the g th latent class, which refers to the j * th endogenous block, R 2 is the coefficient of determination from the model equation, and n g shows the number of units from the g th latent class and m g shows the number dimensions [13] .
To obtain a better local model than the global model, the selected closeness measure is determined according to the structure of the Goodness of Fit (GoF) index.The GoF index, according to Trinchera (2007), is defined as follows: where J is the number of latent variables in the model, and J * is the number of endogenous latent variables in the model. (  , ξ ) is the correlation between the q th manifest variable of the j th block and the corresponding score of the latent variable.
Clustering in REBUS-PLS aims to overcome unobserved heterogeneity by determining groups or clusters of observation units with similar behaviour and performance.In addition, because CM is defined based on the structure of the GoF index, the local model formed will show greater GoF and R 2 values.The following are the steps in the REBUS-PLS method.Iteration continues until there is no change in the composition (observations included in the group or class) of the class or until the stop rule is reached.According to Trinchera (2007), the stop rule is when the difference in class composition is less than 0.05%, and usually, the iteration will converge in less than 15 iterations.

Finite mixture partial least square (FIMIX-PLS)
In SEM-PLS applications, uncovering unobserved heterogeneity is critical to the quality of result presentation and interpretation [14] .Becker [15] explained that SEM is usually used for research purposes to model relationships between latent and manifest variables.Most of its applications assume that the data comes from a homogeneous population.However, the assumption of homogeneity is unrealistic.Each individual has different perceptions and evaluations of latent variables.The analysis applied if it does not pay attention to these segmentation differences will result in biased analysis, and the conclusions drawn will be invalid, so a method that can detect heterogeneity problems in the SEM model is needed.
Hahn et al. ( 2002) [16] developed the FIMIX-PLS method.This approach combines a finite mixture procedure with an EM algorithm specifically related to ordinary least squares (OLS) based on PLS predictions.Sarstedt (2008) [17] reviewed segmentation techniques for PLS path modelling and stated that FIMIX-PLS can be used as a comprehensive approach to determine heterogeneity in PLS path modeling.The statistical criterion used to indicate the best number of segments in FIMIX-PLS is known as entropy (EN).EN is a criterion used to analyze the class specification results of FIMIX-PLS, whose value is between 0 and 1.The higher the EN value, that is, the closer to 1, the better the quality of the separator and the model can be interpreted [16] .The assumption in FIMIX-PLS is that if the observation units have been separated according to their strata, then the case of heteroginity will not occur in the structural model.
The equation formed based on the relationship contained in the inner model is as follows: FIMIX-PLS estimation is based on the assumption that heterogeneity occurs in the structural model, assuming i is finite mixture distributed with a multivariate normal density function   | ( . ) .k classes or segments with k = 1, 2, 3, …, K Model estimation in FIMIX-PLS follows the likelihood principle.The likelihood function in FIMIX-PLS is maximized by the Expectation-Maximization (EM) algorithm.The EM algorithm is a combination of the expectation (E) step and the maximization (M) step.The E-step produces an expected log-likelihood function that is used for parameter estimation.The M-step calculates the parameters by maximizing the expected log-likelihood of the E-step [4] .
The optimum number of segments in FIMIX-PLS cannot be known for several reasons, including that the mixture model is not asymptotically chi-square distributed and valid for likelihood ratio tests.Hahn et al. (2002) suggested repeating the FIMIX-PLS operation with consecutive numbers of latent classes K (e.g., 1-10).Criteria such as Akaike's Information Criterion (AIC), Bayesian Information Criterion (BIC), Consistent AIC (CAIC), and Normed Entropy Criterion (EN) are used to determine the number of data segmentations.The formulas used to calculate the AIC, CAIC, BIC, and EN criteria are as follows [16] .
where c is a constant, and   is the number of parameters.
R is the number of predictor variables in the inner model regression model.
in this equation the value of c in Eq. ( 25) is ln  .
in the above equation the value of c in Eq. ( 25) is (ln ( ) + 1 ) with EN: Normal Entrophy, relative size between 0-1   : probability of the i th observation in the k th class k: class or segment with k = 1, 2, …, K i: i th observation with i = 1, 2, …, I

Research methodology
This study uses secondary data obtained from the Southeast Sulawesi Provincial Health Office 2020 on the nutritional status of children under five.The unit of observation used is the sub-district level in Southeast Sulawesi Province, which consists of 216 sub-districts.The research conceptual framework is presented in the following figure: In the conceptual framework of Nutritional Status of Toddlers according to UNICEF in 2020 in Fig. 1 , there are three basic factors that can affect the nutritional status of toddlers, namely food factors, service factors, and practice factors.This study uses three basic factors with indicators from each factor tailored to the variables in the research data.The nutritional status of toddlers in the study used was stunting, underweight, and wasting.Indicators of the food factor are pregnant women getting Fe1 tablets and Fe3 tablets.Indicators of the service factor are babies with low birth weight, pregnant women with chronic energy deficiency, anaemic pregnant women, and complete neonatal visits.Indicators from the practice factor of the number of posyandu will help mothers and babies do health checks to avoid malnutrition.Details are presented in the following table.
The analysis step carried out in this study consists of two stages, namely the modeling analysis stage using SEM-PLS to answer the first research objective and the clustering analysis stage using the REBUS-PLS method to answer the second research objective.6. Hypothesis testing of measurement models and structural models using the bootstrap resampling method with hypothesis and t-test statistics.
After obtaining the model with the SEM-PLS method, segmentation is continued with the REBUS-PLS and FIMIX-PLS methods.The REBUS-PLS method analysis steps are as follows: 1. Conduct agglomerative hierarchical cluster analysis based on structural and communal residuals from the global model of the first-stage analysis results.Next is to conduct clustering analysis with the FIMIX-PLS approach.
1. Estimating the path model based on the PLS algorithm.2. The factor score value of a significant model is obtained from the latent variable factor score value in the inner model.The factor score value in the inner model is used for the FIMIX-PLS procedure, namely determining the number of groups.3. Expost analysis and assessment of explanatory variables for segmentation.4. Comparing the previous data segmentation with the predicted results of the path model.5. Evaluating and interpreting the results of PLS path segmentation.6. Draw conclusions and make suggestions.

Results
Descriptive statistics will be used to determine the characteristics of each factor affecting the nutritional status of children under five in Southeast Sulawesi in 2020, with 216 sub-districts as observation units.The following are the characteristics of the factors that affect the nutritional status of children under five.
Table 2 shows that the average value of the three nutritional status problems of children under five years old is stunting, with the highest number of cases, followed by underweight and wasting.Stunting has an average number of cases of 68.866, wasting is 19.278 cases, and underweight is 49.042.This shows that the number of cases of under-five nutritional status problems in Southeast Sulawesi Province is still quite large.The sub-districts with the highest number of cases are the Mawasangka sub-district for stunting cases, the Binongko sub-district for wasting cases, and the Pasar Wajo sub-district.Then there are 29 sub-districts with 0 stunting cases, 33 sub-districts with 0 wasting cases, and 27 sub-districts with underweight cases.
The number of posyandu in Southeast Sulawesi Province as a manifest variable of the latent variable practice factor has an average of 9.231, so it can be said that the number of posyandu in each sub-district is quite low.There are 77 out of 216 sub-districts that do not have posyandu, and the sub-district with the highest number of posyandu is the Wangi-Wangi sub-district, with 54 existing posyandu.

SEM-PLS analysis
PLS-SEM analysis begins by forming a path diagram by designing a measurement model and structural model first in accordance with the conceptual framework that has been built.In this study, the structural model consists of 1 exogenous latent variable, namely the practice factor ( 1) or (X), and 3 endogenous latent variables, namely food factors ( 1) or (Y1), service factors ( 2) or (Y2), and nutritional status of toddlers ( 3) or (Y3).The measurement model in this study uses a reflective measurement model where manifest variables are influenced by latent variables.Furthermore, the relationship between latent variables and their indicators and the relationship between exogenous and endogenous latent variables is described through a path diagram.The path diagram can be shown in Fig. 2

below.
Measurement model analysis consists of testing the validity and reliability of each indicator against its latent variable or measurement model evaluation.Validity and reliability tests aim to determine whether the indicators used are valid and reliable for measuring each latent variable.The validity assessment is based on the value of the loading factor, which shows the correlation of each indicator with its latent variable.Discriminant validity is determined by looking at the square root value of the average variance extracted (AVE).If the AVE root value is > 0.50, it can be said that the latent variable has good discriminant validity.Reliability testing is carried out to determine the reliability of latent variables as a measuring instrument.Latent variables are said to have high reliability as measuring instruments if they have a Cronbach's alpha value ≥ 0.60 and composite reliability ≥ 0.70.The measurement model estimation results are presented in the following table.
Table 3 shows that all indicators of each latent variable have a factor loading value above 0.5 with a T-statistic greater than t-table = 1.96 and a p-value < 0.05, so the indicator is valid and significant.Furthermore, it also provides a composite reliability value above the cut-off value of 0.7, so it can be said that all latent variables are reliable.The practice factor (X) is formed by the indicator number of posyandu (X1.1).The food factor (Y1) is formed by the indicators of the number of pregnant women getting Fe1 tablets (Y1.1) (0.992) and the number of pregnant women getting Fe3 tablets (Y1.2) (0.992).Service Factor (Y2) dominant indicators are the number of complete neonate visits (Y2.4) (0.884), the number of pregnant women with chronic energy deficiency (CHD) (Y2.2) (0.752), and the number of babies with low birth weight (LBW) (Y2.1) (0.727).The dominant nutritional status of toddlers (Y3) is an indicator of the number of underweight toddlers (Y3.3) (0.958), stunting toddlers (Y3.1) (0.919), and wasting toddlers (Y3.2) (0.913).
The results of the original estimation and the bootstrap estimation, B = 500, are presented in the following figure.
The results of testing the complete model can be seen from the R-Square value, which describes the goodness-of-fit of a model.The recommended R-square value is greater than zero.The R-square value is presented in Table 4 below: Table 4 explains that the contribution or proportion of the variable Practice Factor (X) to the Food Factor (Y1) is 0.148, the variable Food Factor (Y1) to the Service Factor (Y2) is 0.686, and the Service Factor (Y2) to the Nutritional Status of Toddlers (Y3) is 0.078.The results of all R-Square values show that all R-Square values are greater than zero.This means that this research model has    met the required Goodness of Fit.Based on the f2 value, it can be seen that the service factor latent variable has a small influence, the food factor latent variable has a large influence, and the practice factor latent variable has a moderate influence.
The results of the calculation of the Q-square value from Table 4 are as follows: It can be interpreted that the model can explain the nutritional status of toddlers (Y3) by 75.3%, and 24.7% is explained by other variables outside the model.The structural equation is presented below: The path coefficient test in Fig. 3 and the above equation in detail are presented in the following table: Based on Table 5 and the equation above, it can be interpreted as follows: 1. Practice factors have a significant and positive effect on food factors.This can be seen from the positive path coefficient of 0.390 with a T-statistics value of 4.104, which is greater than the t-table of 1.96 or p-value of 0.000, which is smaller than the significance level ( ) determined at 0.05.Thus, the practice factor has a direct effect on the food factor of 0.390, which means that every increase in the practice factor will increase the food factor by 0.390.2. Food factors have a significant and positive effect on service factors.This can be seen from the positive path coefficient of 0.829 with a T-statistics value of 34.855, which is greater than the t-table of 1.96, or the p-value of 0.000, which is smaller than the significance level ( ) determined at 0.05.Thus, the food factor has a direct effect on the service factor of 0.829, which means that every increase in the food factor will increase the service factor by 0.829.3. Service factors have a significant and positive effect on the nutritional status of children under the age of five.This can be seen from the positive path coefficient of 0.287 with a T-statistics value of 34.855, which is greater than the t-table = 1.96, or p-value = 0.000, less than the significance level ( ) determined at 0.05.Thus, the service factor has a direct effect on the nutritional status of children under five by 0.287, which means that every increase in the service factor will increase the nutritional status of children under five by 0.287.
PLS-SEM analysis assumes that there is heterogeneity in the observation data used; to overcome this, the REBUS-PLS and FIMIX-PLS methods are used.The initial stage in the REBUS-PLS analysis is to obtain the initial class and calculate the Closeness Measure Index (CM Index) value for each class formed, while in FIMIX-PLS, the Normed Entropy Criterion (EN) is used to determine the number of data segmentations.Initial class initialization used Ward's method of cluster analysis.
The initial class is obtained based on the results of hierarchical cluster analysis using Ward's method in the form of a dendrogram.Determining the best number of clusters formed is based on the following statistical criteria: Table 6 shows that the best number of clusters for the REBUS-PLS method is 2 clusters.This can be seen from the GQI value; the GQI Inner model provides the highest value.While in the FIMIX-PLS method, the best number of clusters is 2 clusters, this can be seen from the EN value in 2 clusters giving the highest value, which is 0.738.
Grouping observations in each cluster were formed based on the Closeness Measure Index (CM Index) value for the REBUS-PLS method and the EN value for FIMIX-PLS.Grouping is done by comparing the CM index values between segments; the segment with the smallest CM index value indicates that the observation is included in the segment.EN is a criterion used to analyze the results of class specifications from FIMIX-PLS, whose value is between 0 and 1.A higher EN value indicates that the observation is included in the segment (see Appendix 1 ).Based on the grouping results using the CM value, in segment 1, there are 75 sub-districts, and in segment 2, there are 141 sub-districts.The characteristics for each segment based on descriptive statistics are presented as follows: After cluster analysis, observations are grouped based on the probabilities of segment membership values.Grouping is done by comparing the probabilities of segment membership values between segments; the segment with the smallest probability of segment membership value indicates that the observation is included in the segment.The following is a mapping of the results of the clustering of observation units based on the CM index value and the EN index.
After grouping observations based on the CM index value, the REBUS-PLS results for 2 clusters form local model 1 (L1) and local model 2 (L2), which are compared with the results of SEM-PLS analysis as the global model (GM).The following is the path diagram for GM, L1, and L2.
Then the REBUS-PLS results will be compared with the PLS-SEM analysis results to see the comparison of the analysis results using PLS-SEM and REBUS-PLS.The following is a comparison of path diagrams for GM, L1, and L2.
Based on Table 8 , it can be seen from the path coefficient value that each relationship between latent variables has a positive influence.When viewed based on the path coefficient value for the relationship between practice factors and food factors, local model     Table 10 shows that the Q-square values in the REBUS and FIMIX method segments provide higher values than the PLS method, except for Segment 2 in the FIMIX method.Segment 1 in the REBUS and FIMIX methods has a larger R2 value than in the global model; this is in the food factor model that affects service factors.However, because the comparison between the R2 values for the local model and the global model is not much different, it does not mean that if the analysis is carried out using the original data without being grouped, it will produce poor results.However, analysis using data that has been grouped will produce greater results than without grouping, even though the difference is not so significant ( Figs. 4-6 , Tables 1 , 7 ).

Conclusion
Based on the results of the analysis and discussion that have been carried out, it can be concluded as follows: 1.The results of the SEM-PLS analysis with the measurement model show that the Posyandu indicator is a valid, reliable, and significant indicator for describing the practice factor.Pregnant women receiving Fe1 tablets and pregnant women receiving Fe3 tablets are valid and significant indicators in describing food factors.Then, infants with low birth weight (LWB), pregnant women with chronic energy deficiency (CED), anemic pregnant women, and neonate visits are valid and significant indicators in describing the service factor.Stunting, wasting, and underweight are valid and significant indicators in describing the nutritional status of children under five.Then, the results of the analysis with the structural model show that the practice factor significantly affects the food factor with a path coefficient value of 0.390, the food factor significantly affects the service factor with a path coefficient of 0.829, and the service factor significantly affects the nutritional status of children under five with a path coefficient value of 0.287.

1.
Calculate the structural and communality residuals from the global model to obtain initial groups or classes.The initial groups or classes were obtained based on the results of hierarchical cluster analysis of the remaining structural communalities in the form of a two-dimensional diagram, or dendrogram.2. Calculate the local model of each class or segment formed using SEM-PLS analysis.3. Recalculate the communality and structural residuals of each observation unit in each class formed.Then the CM value of each unit of each local model can be obtained.4. Insert the observation units into the class that shows a smaller CM value.If the composition of observation units in a class changes, the number of local model classes also changes and is re-estimated using SEM-PLS.

2 .
Selecting the number of segments or clusters based on the dendrogram formed.3. Grouping each case into segments according to agglomerative hierarchical cluster analysis 4. Estimating the local model in each segment using SEM-PLS 5. Calculating the closeness measure index (CM index) for each case in each local model 6.Clustering each case or observation unit based on the smallest CM index value formed and recalculating the local model for each cluster until the cluster composition does not change.7. Describe the classes obtained according to the differences between local models.8. Comparing the analysis results on the global model and local model 9. Draw conclusions and make suggestions.

Fig. 5 .
Fig. 5. Mapping of cluster analysis results based on the CM index in Southeast Sulawesi.
is the standard error for λℎ .The rejection region used is H0 rejected if | | >  ∕ 2 ,

Table 1
Operational definition of research variables.

Table 2
Descriptive statistics based on research variable indicators.

Table 3
Indicator validity and reliability test on latent variables.

Table 4
The goodness of fit of the R-square model of toddler nutrition status.
Source: Data processed.

Table 5
Coefficient of structural model parameters.

Table 6
Cluster determination criteria from the results of Ward's method.

Table 7
Descriptive statistics of indicators in segment 1 and segment 2: REBUS and FIMIX methods.

Table 8
Path coefficients for the global model, segment 1, and segment 2.

Table 9
Comparison of loading factors.greater path coefficient value than the global model.The relationship between food factors and service factors Local model 2 has a greater path coefficient value than the global model, and for the relationship between service factors and the nutritional status of toddlers, local model 2 has a greater path coefficient value than the global model.This shows that the results of the analysis using homogeneous data have greater results than heterogeneous data.Next, a comparison of loading factors between the global model, local model 1, and local model 2 is carried out.

Table 10
Comparison of R-square and Q-square values.
2. REBUS-PLS and FIMIX-PLS produce two segments, which is the number of best segments formed.REBUS-PLS: segment 1 consists of 75 sub-districts, and segment 2 consists of 141 sub-districts.In FIMIX-PLS, segment 1 consists of 46 sub-districts, and segment 2 consists of 170 sub-districts.In both analyzes using REBUS-PLS and FIMIX-PLS, the local model has an R2 value that is greater than the global model, even though there is a value in one local model that is smaller than the global model.For this variable, the other local model has a greater value compared to the global model.The REBUS-PLS method provides the best performance in dealing with heterogeneity compared to FIMIX-PLS and PLS. 3. The food factor has a dominant influence on the service factor; therefore, the indicator number of pregnant women receiving Fe1 tablets and Fe3 tablets is the indicator that has the greatest contribution in describing the service factor and then to the nutritional status of children under five, so that the indicator should be given more attention to reducing stunting, wasting, and underweight in Southeast Sulawesi.In addition, the different coefficient results in each model indicate different characteristics for each cluster formed and allow that there are other factors that can affect the occurrence of stunting, wasting, and underweight in each cluster.